Log-periodic oscillations due to discrete effects in complex networks

We show that discretization of the internode distance distribution in complex networks affects the internode distances $\langle l_{ij}\rangle$ calculated as a function of the degree product $k_i k_j$, as well as the average path length $\langle l\rangle$ as a function of the network size $N$. For dense networks both quantities display log-periodic oscillations. We present real-world examples of this behavior, derive analytical expressions, and compare them to numerical simulations. We also consider a simple network optimization problem and argue that discrete effects can lead to a nontrivial solution.

During the last few years much attention has been paid to the average path length in complex networks. Several authors [?] have dealt with this problem, using different approaches to obtain analytical expressions for the average path length. There is a good reason to explore this quantity: it was the small value of the average path length in systems such as social and technological [2] networks that drew scientists to this field. The average path length has different interpretations. It may simply be a "chemical distance" between routers or WWW pages [8], but it also appears as the "degree of separation" in acquaintances between people [9], the number of changes in public transport systems, or information handling in a city [12].

In this Letter we study a simple effect, log-periodic oscillations in average path lengths, which we observe in several real-world examples. Using a formalism developed in [3], we give a theoretical explanation of this feature, supported by numerical simulations of scale-free networks with different scaling exponents. We show that such oscillations are due to discrete effects in the path length distributions of networks with large average degree. We also study the fundamental and well-known problem of optimal network density, which balances the shortest average path length against the smallest number of links in a network. We find that the oscillations substantially influence the solution of this problem.

It has recently been shown [13, 14] that the average distance $\langle l_{ij}\rangle$ between nodes $i$ and $j$ characterized by degrees $k_i$ and $k_j$ can be expressed as:

$$\langle l_{ij}\rangle = a - b\log(k_i k_j). \tag{1}$$



This relation is fulfilled in a wide spectrum of real-world networks and their models, such as random graphs or Barabási-Albert evolving networks. However, our recent research shows deviations from this scaling law that take the form of regular oscillations. They can be clearly seen in Fig. 1, where four real-world networks and two commonly known models are gathered.
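
As a rough illustration of how the coefficients $a$ and $b$ in Eq. (1) can be estimated, the sketch below builds a small Erdős-Rényi graph, measures all shortest paths by breadth-first search and fits the distance against $\ln(k_i k_j)$ by ordinary least squares. It is not the procedure used for Fig. 1: the graph size, the mean degree, the natural logarithm and the unbinned fit are assumptions made here, and the function names are hypothetical.

```python
import math
import random
from collections import deque


def random_graph(n, mean_degree, rng):
    """Erdos-Renyi graph as adjacency lists (illustrative test network)."""
    p = mean_degree / (n - 1)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def bfs_distances(adj, source):
    """Shortest-path lengths from source; -1 marks unreachable nodes."""
    dist = [-1] * len(adj)
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] < 0:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def fit_distance_scaling(adj):
    """Least-squares fit of l_ij = a - b*ln(k_i*k_j) over connected pairs."""
    xs, ys = [], []
    for i in range(len(adj)):
        dist = bfs_distances(adj, i)
        for j in range(i + 1, len(adj)):
            if dist[j] > 0:  # reachable, hence both degrees are >= 1
                xs.append(math.log(len(adj[i]) * len(adj[j])))
                ys.append(dist[j])
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, -slope  # (a, b)


adj = random_graph(300, 6.0, random.Random(0))
a, b = fit_distance_scaling(adj)
print(f"a = {a:.2f}, b = {b:.2f}")
```

A positive $b$ means that pairs of well-connected nodes are, on average, closer to each other. The oscillations discussed in this Letter would show up as a systematic residual around such a fitted line in dense networks.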

FIG. 1: (color online) Mean distance $\langle l_{ij}\rangle$ between pairs of nodes $i$ and $j$ as a function of the product of their degrees $k_i k_j$ for four real-world networks and two models. (a) Astro coauthorship network, $N = 13986$, $\langle k\rangle = 25.56$. (b) English language word co-occurrence network, $N = 7381$, $\langle k\rangle = 11.98$. (c) Caribbean food web network, $N = 249$, $\langle k\rangle = 25.73$. (d) Opole public transport network, $N = 205$, $\langle k\rangle = 50.19$. (e) Erdős-Rényi random graph, $N = 10000$, $\langle k\rangle = 40$. (f) Barabási-Albert network, $N = 10000$, $m = 20$. All data are logarithmically binned. For data sources see [?].

FIG. 2: (color online) Comparison of two networks characterized by the hidden variable distribution $p(h) = (\alpha - 1)\, m^{\alpha-1} h^{-\alpha}$ for $\alpha = 3.0$ and $N = 10000$; upper row $m = 2$, lower row $m = 40$. (a) Samples of sparse (upper) and dense (lower) networks. (b), (c), (d) Detailed description in the text and in the caption of Fig. 3. For plots (b) and (c) the values of $A$ have been chosen so that the deviation is maximal.

To explain the differences between Eq. (1) and the plots in Fig. 1 we will use and modify the results obtained by Fronczak et al. in [5]. In the cited paper exact expressions for the average path length were derived using the hidden variables formalism. Assuming that each node $i$ is characterized by its hidden variable $h_i$ randomly drawn from a given distribution $p(h)$, and that the connection probability between any pair of nodes is proportional to $h_i h_j$, one can show that the degree distribution $P(k)$ is:



$$P(k) = \sum_{h} \frac{e^{-h} h^{k}}{k!}\, p(h). \tag{2}$$
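
To make the hidden-variable construction concrete, the following sketch draws hidden variables from the scale-free distribution $p(h) = (\alpha-1)\,m^{\alpha-1}h^{-\alpha}$ (for $h \ge m$) by inverse-transform sampling and links each pair of nodes with a probability proportional to $h_i h_j$. The text only states proportionality; the normalization $h_i h_j / \sum_l h_l$, the cap at 1 and all function names are assumptions made for this illustration, not the authors' generator.

```python
import random


def hidden_quantile(u, m, alpha):
    """Inverse CDF of p(h) = (alpha-1) m^(alpha-1) h^(-alpha) on h >= m."""
    return m * (1.0 - u) ** (-1.0 / (alpha - 1.0))


def hidden_variable_graph(n, m, alpha, rng):
    """Return hidden variables and an edge list of one network realization."""
    h = [hidden_quantile(rng.random(), m, alpha) for _ in range(n)]
    norm = sum(h)  # equals <h> * N for this sample (assumed normalization)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # connection probability proportional to h_i * h_j, capped at 1
            if rng.random() < min(1.0, h[i] * h[j] / norm):
                edges.append((i, j))
    return h, edges


h, edges = hidden_variable_graph(1000, 2.0, 3.0, random.Random(7))
print(f"<h> = {sum(h) / len(h):.2f}, <k> = {2 * len(edges) / len(h):.2f}")
```

With this normalization the expected degree of a node is close to its hidden variable, apart from the effect of the cap and the excluded self-link.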



The probability $p^{*}_{ij}(x)$ that vertices $i$ and $j$ are $x$-th neighbors can be expressed [5] as $p^{*}_{ij}(x) = F(x-1) - F(x)$, where

$$F(x) = \exp\left(-A B^{x}\right), \tag{3}$$

with parameters $A$ and $B$. One should keep in mind that the parameter $B$ is a "global" one (i.e. its value is determined only by the first and second moments of the hidden variable distribution), while $A$ can be called "local": it depends on the specific product $h_i h_j$. As the expectation value of the distance between $i$ and $j$ can be expressed as $\langle l_{ij}\rangle = \sum_{x=1}^{\infty} x\, p^{*}_{ij}(x) = \sum_{x=0}^{\infty} F(x)$, one can write the following equation using the Poisson summation formula:

$$\langle l_{ij}\rangle \approx \frac{-\ln A - \gamma}{\ln B} + \frac{1}{2} + R, \qquad R = \sum_{n=1}^{\infty} R_n = \sum_{n=1}^{\infty}\left(2\int_{0}^{\infty} F(x)\cos(2n\pi x)\,dx\right), \tag{4}$$



where $\gamma \simeq 0.5772$ is Euler's constant. If the average number of links is relatively small then, due to the generalized mean value theorem, the term $R$ can be neglected. Otherwise one must take into account at least the first term of the infinite series in Eq. (4), which leads to log-periodic oscillations of $\langle l_{ij}\rangle$ with the period $\Delta\ln(h_i h_j) = \ln B$ (see discussion below). Figure 2 shows a comparison of such oscillations in sparse ($m = 2$, upper row) and dense ($m = 40$, lower row) scale-free networks characterized by the hidden variable distribution $p(h) = (\alpha-1)\,m^{\alpha-1}h^{-\alpha}$ with $\alpha = 3$. The networks have been generated following procedure C in [?] and represent the class of random networks with asymptotically scale-free connectivity distributions characterized by an arbitrary scaling exponent $\alpha > 2$. In Fig. 2(b), $F(x)$ (dotted line) and $p^{*}_{ij}(x)$ (solid line) are presented together with points corresponding to the discrete values of those functions. It is clearly seen that for $m = 40$ the probability $p^{*}_{ij}$ is much narrower than for $m = 2$, thus $F(x)$ drops more steeply. Figure 2(c) shows the cosine transform of $F(x)$ given by the integral in Eq. (4). Depending on the shape of $F(x)$, the amplitude of this transform can take small or large values, resulting in small or large values of $R$. One should keep in mind that, because $R$ is in fact a sum of discrete values of this transform, taking only the first term in the sum (i.e. $n = 1$) is sufficient to obtain a good approximation of $R$ (cf. the points corresponding to the discrete values $R_n$ in Fig. 2(c)). Figure 2(d) shows the resulting average distance $\langle l_{ij}\rangle$ between nodes $i$ and $j$ as a function of the hidden variable product $h_i h_j$ without (dotted lines) and with (solid lines) the term $R$ included. For the sparse network the term $R$ can be omitted (the curves overlap), while for the dense one it substantially modifies the shape of $\langle l_{ij}\rangle$.

FIG. 3: (color online) Function $F(x)$ (solid lines) and its linear approximation $\tilde F(x)$ (dotted lines) for a scale-free network with $\alpha = 3$, $N = 10000$ and $m = 40$, calculated for three different values of the product $h_i h_j$ (see labelled dashed lines in Fig. 2(d)): ($A_{-}$) $h_i h_j = 11389$, maximal negative deviation from the $\langle l_{ij}\rangle$ trend; ($A_{0}$) $h_i h_j = 43249$, minimal (zero) deviation from the $\langle l_{ij}\rangle$ trend; ($A_{+}$) $h_i h_j = 198730$, maximal positive deviation from the $\langle l_{ij}\rangle$ trend. The dashed line represents the point of inflexion $x_i$ of $F(x)$, where $F(x_i) = 1/e$, used to calculate the tangent of $F(x)$. The inset shows $R_1$ versus the product $h_i h_j$ for $m = 2$ (dotted line) and $m = 40$ (solid line).
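
The size of the correction $R$ can be checked numerically without any approximation by comparing the discrete sum $\sum_{x\ge 0} F(x)$ with the smooth part of Eq. (4). The sketch below assumes the form $F(x) = \exp(-A B^{x})$, which is consistent with the inflexion condition $F(x_i) = 1/e$ and with the tangent used in Eq. (5). The values of $A$ and $B$ are illustrative inputs chosen here, not parameters taken from the figures.

```python
import math

EULER_GAMMA = 0.5772156649015329


def F(x, A, B):
    """Assumed form of F(x); requires B > 1."""
    return math.exp(-A * B ** x)


def exact_distance(A, B, tol=1e-15):
    """Discrete expectation value sum_{x>=0} F(x)."""
    total, x = 0.0, 0
    while True:
        term = F(x, A, B)
        total += term
        if term < tol:
            return total
        x += 1


def smooth_trend(A, B):
    """Non-oscillating part of Eq. (4), valid for A << 1."""
    return (-math.log(A) - EULER_GAMMA) / math.log(B) + 0.5


def oscillating_part(A, B):
    """Numerical estimate of R as the difference between sum and trend."""
    return exact_distance(A, B) - smooth_trend(A, B)


for B in (2.0, 184.0):  # sparse-like and dense-like values (illustrative)
    residuals = [oscillating_part(1e-4 * B ** (-t / 20), B) for t in range(20)]
    print(f"B = {B:6.1f}: max |R| over one period = {max(map(abs, residuals)):.2e}")
```

For the small value of $B$ the residual stays far below any measurable level, while for the large one it becomes a visible fraction of a single step in distance, in line with the sparse versus dense comparison described for Fig. 2.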

To obtain more quantitative results one should evaluate the integral in Eq. (4); however, it cannot be done analytically, so in order to calculate the term $R$ one can approximate $F(x)$ with the following piecewise linear function $\tilde F(x)$:

$$\tilde F(x) = \begin{cases} 1, & x < x_0,\\ \frac{1}{e}\left(1 - \ln A - x\ln B\right), & x \in [x_0, x_1],\\ 0, & x > x_1, \end{cases} \tag{5}$$

where $x_0 = (1 - \ln A - e)/\ln B$ and $x_1 = (1 - \ln A)/\ln B$. Since the function $F(x)$ is translationally invariant with respect to the argument $x$ after rescaling the parameter $A$, i.e. $F(x; A) = F(x - x'; A')$, one can freely choose the point at which the slope coefficient is calculated as the tangent of $F(x)$. In order to simplify the calculation we have chosen the inflexion point $x_i$ of $F(x)$. The functions $F(x)$ and $\tilde F(x)$ are presented in Fig. 3. Using Eq. (5) one can approximate the terms $R_n$ with

$$R_n \approx -\frac{\ln B\,\sin\!\left(\frac{n\pi e}{\ln B}\right)}{e\,\pi^{2} n^{2}}\,\sin\!\left[\frac{n\pi}{\ln B}\left(2\ln A - 2 + e\right)\right]. \tag{6}$$
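
For completeness, the cosine transform of the piecewise linear function in Eq. (5) can be carried out explicitly. The short derivation below is a sketch that assumes $x_0 > 0$ and uses the shorthand $\omega = 2n\pi$, which is our notation and does not appear in the original text.

```latex
\begin{align}
R_n &\approx 2\int_0^{\infty} \tilde F(x)\cos(\omega x)\,dx
     = 2\int_0^{x_0}\cos(\omega x)\,dx
       + \frac{2}{e}\int_{x_0}^{x_1}\left(1-\ln A - x\ln B\right)\cos(\omega x)\,dx \\
    &= -\frac{2\ln B}{e\,\omega^{2}}\left[\cos(\omega x_1)-\cos(\omega x_0)\right]
     = \frac{4\ln B}{e\,\omega^{2}}
       \sin\!\left(\omega\,\frac{x_0+x_1}{2}\right)
       \sin\!\left(\omega\,\frac{x_1-x_0}{2}\right).
\end{align}
```

In the second line the boundary terms of the integration by parts cancel the first integral, because $\tilde F(x_0) = 1$ and $\tilde F(x_1) = 0$. Inserting $x_1 - x_0 = e/\ln B$, $x_0 + x_1 = (2 - 2\ln A - e)/\ln B$ and $\omega^2 = 4 n^2 \pi^2$ gives a prefactor $\ln B/(e\,\pi^2 n^2)$ and the two sine factors of Eq. (6), with the overall minus sign arising from $\sin(-y) = -\sin y$.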



As one can see, taking only the first term of the above series is justified because the next terms decay as $n^{-2}$. Equation (6) allows us to make the immediate observation that deviations from Eq. (1) take the form of regular oscillations along the logarithmic $h_i h_j$ axis with a period equal to $\ln B$, which increases with the heterogeneity of the network (see inset in Fig. 3). This very value is forced by the discrete nature of distance in a network: the period along $\langle l_{ij}\rangle$ is equal to 1, and the slope of $\langle l_{ij}\rangle$ versus $\ln(h_i h_j)$ has magnitude $(\ln B)^{-1}$ (see Eq. (4)). One can also easily calculate that the deviation vanishes when $\langle l_{ij}\rangle \approx k/2$, where $k$ is an integer. For dense networks the amplitude of the oscillations grows monotonically with $B$; that is why the effect is visible only in sufficiently dense networks. Figure 4 presents a comparison of the average distance $\langle l_{ij}\rangle$ versus $h_i h_j$ for scale-free networks with different scaling exponents $\alpha$. As expected, the amplitude of the oscillations rises with decreasing $\alpha$, which can be easily understood since $\ln B_{\alpha_1} > \ln B_{\alpha_2}$ for $\alpha_1 < \alpha_2$. Similar oscillations can also be observed for the average path length $\langle l\rangle$, whose value is obtained by integrating Eq. (4) over all products $h_i h_j$ (see inset (b) in Fig. 5).

FIG. 4: (color online) Average distance $\langle l_{ij}\rangle$ between nodes $i$ and $j$ versus their hidden variable product $h_i h_j$ [plots (a), (b) and (c)] or degree product $k_i k_j$ [plot (d)] for scale-free networks of $N = 10000$ nodes with $\alpha = 2.2$ (a), $\alpha = 3$ (b) and (d), and $\alpha = 4$ (c). Scatter data are obtained using the algorithm presented in [?], while solid lines have been calculated from Eq. (4), where $R$ is taken directly from Eq. (6).
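
The properties of Eq. (6) quoted above (agreement with the cosine transform of $\tilde F$, period $\ln B$ in $\ln A$, and an amplitude that grows with $B$) can be verified with a few lines of code. The sketch below is only a consistency check under the assumption $x_0 > 0$. The midpoint quadrature, the chosen values of $A$ and $B$ and the function names are ours.

```python
import math


def r_n_closed(n, A, B):
    """R_n from the piecewise linear approximation, Eq. (6)."""
    ln_a, ln_b = math.log(A), math.log(B)
    prefactor = ln_b * math.sin(n * math.pi * math.e / ln_b) / (math.e * math.pi ** 2 * n ** 2)
    return -prefactor * math.sin(n * math.pi * (2 * ln_a - 2 + math.e) / ln_b)


def f_linear(x, A, B):
    """Piecewise linear approximation of F(x), Eq. (5)."""
    ln_a, ln_b = math.log(A), math.log(B)
    x0 = (1 - ln_a - math.e) / ln_b
    x1 = (1 - ln_a) / ln_b
    if x < x0:
        return 1.0
    if x > x1:
        return 0.0
    return (1 - ln_a - x * ln_b) / math.e


def r_n_numeric(n, A, B, steps=100_000):
    """2 * integral of f_linear(x) * cos(2 n pi x) over [0, x1], midpoint rule."""
    x1 = (1 - math.log(A)) / math.log(B)
    dx = x1 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * dx
        total += f_linear(x, A, B) * math.cos(2 * n * math.pi * x)
    return 2 * total * dx


def amplitude(B, n=1):
    """Envelope of R_n in Eq. (6) as a function of B."""
    ln_b = math.log(B)
    return abs(ln_b * math.sin(n * math.pi * math.e / ln_b)) / (math.e * math.pi ** 2 * n ** 2)


A, B = 1e-4, 184.0  # illustrative values with x0 > 0
print(r_n_closed(1, A, B), r_n_numeric(1, A, B))      # closed form vs quadrature
print(r_n_closed(1, A, B), r_n_closed(1, A * B, B))   # shift of ln A by ln B
print([round(amplitude(b), 3) for b in (20, 50, 184, 1000, 10000)])
```

The envelope vanishes at $\ln B = e$ and increases monotonically above it, approaching $1/\pi$ for very large $B$, which is one way to read the statement that the effect is visible only in sufficiently dense networks.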



Let us now focus on possible applications of the presented phenomenon. One of them is network optimization, which has been widely studied in recent years [18, 19, 20]. Such optimization is of common interest in many different areas, among them electrical engineering, telecommunication, road construction and trade logistics. The simplest model is based on the assumption of minimal transport costs. These costs include two main aspects of network performance: the price of constructing and maintaining links between nodes, and the price caused by communication delays in information transfer. The former is proportional to the total number of links (we assume the same price for every link), while the latter should be proportional to the sum of
the shortest existing connections between each pair of nodes:

$$C = (1-\lambda)\,\langle k\rangle + \lambda\left\langle \sum_{j} l_{ij}\right\rangle. \tag{7}$$

Here $\lambda$ is a parameter controlling the ratio between the price of a single link and the costs of communication delays. In effect one has to find an optimal link density given two contradictory demands: a fully connected network with the shortest connections, and a tree with the smallest number of links. A typical solution of this problem is a unimodal cost function with a minimum at some intermediate value of $\langle k\rangle$. The discrete effects in networks studied above can reshape the total cost function. As an example let us consider a scale-free network generated by the method described in [18] with parameters $N = 10^{?}$ and $\alpha = 3$. The cost function for this network is presented in Fig. 5 (we also show how this function would look if discrete effects were neglected). One can see that neglecting the correction term can lead to an underestimation of the optimal network density by about 30%. Inset (a) of this figure, obtained for another value of the parameter $\lambda$, shows a different situation: instead of one global minimum there are now two well-separated minima. A network administrator who tries to operate in accordance with the economic rule (7) has to remember that improving network efficiency can lead to a temporary increase of costs, which can be discouraging since one has to pass over the cost barrier. A much simpler application of the observed phenomenon is presented in inset (b) of Fig. 5, where one can see that during network growth there are regions where the average path length increases more slowly (faster), which can encourage (discourage) the network administrator to expand the network further.

FIG. 5: (color online) Cost function $C$ versus average degree $\langle k\rangle$ for a scale-free network characterized by $N = 10^{?}$ nodes, $\alpha = 3.0$ and $\lambda = 10^{?}$. The solid line is obtained including the oscillation correction, while the dotted line neglects it. Inset (a) shows the cost function $C$ for identical network parameters $N$ and $\alpha$ but with $\lambda = 5.4\cdot 10^{?}$. The left Y-axis corresponds to the cost function with the oscillation correction (solid line), while the right Y-axis corresponds to the function that neglects the correction (dotted line). Inset (b) presents the average path length $\langle l\rangle$ versus system size $N$ for a scale-free network with $\alpha = 3$ and $m = 40$: the solid line is theory, while the scatter data have been obtained using the hidden variable algorithm [?].
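
The reshaping of the cost function can be reproduced qualitatively in a much simpler setting than the scale-free networks of Fig. 5. The toy sketch below makes several assumptions that are ours and not the authors': a homogeneous network in which every hidden variable equals $\langle k\rangle$ (so that $A = 1/N$ and $B = \langle k\rangle$ in $F(x) = \exp(-AB^{x})$), a delay term written as $\lambda N \langle l\rangle$, and the illustrative values $N = 10^{6}$ and $\lambda = 10^{-4}$. It compares the optimal degree obtained from the discrete sum with the one obtained from the smooth trend alone.

```python
import math

EULER_GAMMA = 0.5772156649015329


def path_length_discrete(k, n):
    """<l> = sum_{x>=0} exp(-k**x / n): homogeneous network with A = 1/n, B = k."""
    total, x = 0.0, 0
    while True:
        term = math.exp(-(float(k) ** x) / n)
        total += term
        if term < 1e-15:
            return total
        x += 1


def path_length_smooth(k, n):
    """Same quantity with the oscillating correction R neglected."""
    return (math.log(n) - EULER_GAMMA) / math.log(k) + 0.5


def cost(k, n, lam, path_length):
    """Assumed cost: link price (1-lam)*<k> plus delay price lam*N*<l>."""
    return (1.0 - lam) * k + lam * n * path_length(k, n)


def optimal_degree(n, lam, path_length, k_values):
    """Average degree on the given grid that minimizes the assumed cost."""
    return min(k_values, key=lambda k: cost(k, n, lam, path_length))


N, LAM = 1e6, 1e-4          # illustrative values, not read from Fig. 5
grid = range(2, 401)
k_smooth = optimal_degree(N, LAM, path_length_smooth, grid)
k_discrete = optimal_degree(N, LAM, path_length_discrete, grid)
print(f"optimum without correction: <k> = {k_smooth}")
print(f"optimum with discrete sum:  <k> = {k_discrete}")
```

In this toy setting the smooth cost has a single minimum, whereas the discrete one develops a second, deeper minimum at a noticeably larger average degree, so neglecting the correction underestimates the optimal density.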

To summarize, we have presented an explanation of the oscillations in the relations $\langle l_{ij}\rangle(k_i k_j)$ and $\langle l\rangle(N)$ observed in real-world networks ranging from scientific collaboration to public transport systems. We have also provided examples of the influence this effect may have on simple optimization problems.